AITopics | amodal completion

Collaborating Authors

amodal completion

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

bacadc62d6e67d7897cef027fa2d416c-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-10-2026, 02:20:03 GMT

amodal completion, camera-ready version, completion, (16 more...)

Neural Information Processing Systems

Genre: Research Report (0.30)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.35)

Add feedback

Occlusion-Aware Temporally Consistent Amodal Completion for 3D Human-Object Interaction Reconstruction

Doh, Hyungjun, Lee, Dong In, Chi, Seunggeun, Huang, Pin-Hao, Lee, Kwonjoon, Kim, Sangpil, Ramani, Karthik

arXiv.org Artificial IntelligenceSep-16-2025

We introduce a novel framework for reconstructing dynamic human-object interactions from monocular video that overcomes challenges associated with occlusions and temporal inconsistencies. Traditional 3D reconstruction methods typically assume static objects or full visibility of dynamic subjects, leading to degraded performance when these assumptions are violated-particularly in scenarios where mutual occlusions occur. To address this, our framework leverages amodal completion to infer the complete structure of partially obscured regions. Unlike conventional approaches that operate on individual frames, our method integrates temporal context, enforcing coherence across video sequences to incrementally refine and stabilize reconstructions. This template-free strategy adapts to varying conditions without relying on predefined models, significantly enhancing the recovery of intricate details in dynamic scenes. We validate our approach using 3D Gaussian Splatting on challenging monocular videos, demonstrating superior precision in handling occlusions and maintaining temporal stability compared to existing techniques.

artificial intelligence, deep learning, machine learning, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3746027.3754769

2507.08137

Country: North America > United States > Indiana > Tippecanoe County (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Contact-Aware Amodal Completion for Human-Object Interaction via Multi-Regional Inpainting

Chi, Seunggeun, Sachdeva, Enna, Huang, Pin-Hao, Lee, Kwonjoon

arXiv.org Artificial IntelligenceAug-4-2025

Amodal completion, which is the process of inferring the full appearance of objects despite partial occlusions, is crucial for understanding complex human-object interactions (HOI) in computer vision and robotics. Existing methods, such as those that use pre-trained diffusion models, often struggle to generate plausible completions in dynamic scenarios because they have a limited understanding of HOI. To solve this problem, we've developed a new approach that uses physical prior knowledge along with a specialized multi-regional inpainting technique designed for HOI. By incorporating physical constraints from human topology and contact information, we define two distinct regions: the primary region, where occluded object parts are most likely to be, and the secondary region, where occlusions are less probable. Our multi-regional inpainting method uses customized denoising strategies across these regions within a diffusion model. This improves the accuracy and realism of the generated completions in both their shape and visual detail. Our experimental results show that our approach significantly outperforms existing methods in HOI scenarios, moving machine perception closer to a more human-like understanding of dynamic environments. We also show that our pipeline is robust even without ground-truth contact annotations, which broadens its applicability to tasks like 3D reconstruction and novel view/pose synthesis.

completion, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.00427

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Bridging Perception and Language: A Systematic Benchmark for LVLMs' Understanding of Amodal Completion Reports

Watahiki, Amane, Doi, Tomoki, Shinozaki, Taiga, Nishida, Satoshi, Niikawa, Takuya, Miyahara, Katsunori, Yanaka, Hitomi

arXiv.org Artificial IntelligenceJul-9-2025

One of the main objectives in developing large vision-language models (L VLMs) is to engineer systems that can assist humans with multimodal tasks, including interpreting descriptions of perceptual experiences. A central phenomenon in this context is amodal completion, in which people perceive objects even when parts of those objects are hidden. Although numerous studies have assessed whether computer-vision algorithms can detect or reconstruct occluded regions, the inferential abilities of L VLMs on texts related to amodal completion remain unexplored. To address this gap, we constructed a benchmark grounded in Basic Formal Ontology to achieve a systematic classification of amodal completion. Our results indicate that while many L VLMs achieve human-comparable performance overall, their accuracy diverges for certain types of objects being completed. Notably, in certain categories, some LLaV A-NeXT variants and Claude 3.5 Sonnet exhibit lower accuracy on original images compared to blank stimuli lacking visual content. Intriguingly, this disparity emerges only under Japanese prompting, suggesting a deficiency in Japanese-specific linguistic competence among these models.

category, large language model, natural language, (20 more...)

arXiv.org Artificial Intelligence

2507.05799

Country:

Asia > Japan (0.28)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.70)

Add feedback

BLS-GAN: A Deep Layer Separation Framework for Eliminating Bone Overlap in Conventional Radiographs

Wang, Haolin, Ou, Yafei, Ambalathankandy, Prasoon, Ota, Gen, Dai, Pengyu, Ikebe, Masayuki, Suzuki, Kenji, Kamishima, Tamotsu

arXiv.org Artificial IntelligenceSep-11-2024

Conventional radiography is the widely used imaging technology in diagnosing, monitoring, and prognosticating musculoskeletal (MSK) diseases because of its easy availability, versatility, and cost-effectiveness. In conventional radiographs, bone overlaps are prevalent, and can impede the accurate assessment of bone characteristics by radiologists or algorithms, posing significant challenges to conventional and computer-aided diagnoses. This work initiated the study of a challenging scenario - bone layer separation in conventional radiographs, in which separate overlapped bone regions enable the independent assessment of the bone characteristics of each bone layer and lay the groundwork for MSK disease diagnosis and its automation. This work proposed a Bone Layer Separation GAN (BLS-GAN) framework that can produce high-quality bone layer images with reasonable bone characteristics and texture. This framework introduced a reconstructor based on conventional radiography imaging principles, which achieved efficient reconstruction and mitigates the recurrent calculations and training instability issues caused by soft tissue in the overlapped regions. Additionally, pre-training with synthetic images was implemented to enhance the stability of both the training process and the results. The generated images passed the visual Turing test, and improved performance in downstream tasks. This work affirms the feasibility of extracting bone layer images from conventional radiographs, which holds promise for leveraging bone layer separation technology to facilitate more comprehensive analytical research in MSK diagnosis, monitoring, and prognosis. Code and dataset will be made available.

layer image, overlap, overlapped region, (12 more...)

arXiv.org Artificial Intelligence

2409.07304

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Japan > Hokkaidō > Hokkaidō Prefecture > Sapporo (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

pix2gestalt: Amodal Segmentation by Synthesizing Wholes

Ozguroglu, Ege, Liu, Ruoshi, Surís, Dídac, Chen, Dian, Dave, Achal, Tokmakov, Pavel, Vondrick, Carl

arXiv.org Artificial IntelligenceJan-25-2024

Our approach capitalizes on diffusion models and transferring their representations to denoising diffusion models [14], which are excellent representations this task, we learn a conditional diffusion model for reconstructing of the natural image manifold and capture all whole objects in challenging zero-shot cases, including different types of whole objects and their occlusions. Due examples that break natural and physical priors, to their large-scale training data, we hypothesize such pretrained such as art. As training data, we use a synthetically curated models have implicitly learned amodal representations dataset containing occluded objects paired with their whole (Figure 2), which we can reconfigure to encode object counterparts. Experiments show that our approach outperforms grouping and perform amodal completion. By learning supervised baselines on established benchmarks. Our from a synthetic dataset of occlusions and their whole counterparts, model can furthermore be used to significantly improve the we create a conditional diffusion model that, given performance of existing object recognition and 3D reconstruction an RGB image and a point prompt, generates whole objects methods in the presence of occlusions.

completion, occlusion, segmentation, (13 more...)

arXiv.org Artificial Intelligence

2401.14398

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.37)

Add feedback

Image Amodal Completion: A Survey

Ao, Jiayang, Ke, Qiuhong, Ehinger, Krista A.

arXiv.org Artificial IntelligenceNov-7-2023

Existing computer vision systems can compete with humans in understanding the visible parts of objects, but still fall far short of humans when it comes to depicting the invisible parts of partially occluded objects. Image amodal completion aims to equip computers with human-like amodal completion functions to understand an intact object despite it being partially occluded. The main purpose of this survey is to provide an intuitive understanding of the research hotspots, key technologies and future trends in the field of image amodal completion. Firstly, we present a comprehensive review of the latest literature in this emerging field, exploring three key tasks in image amodal completion, including amodal shape completion, amodal appearance completion, and order perception. Then we examine popular datasets related to image amodal completion along with their common data collection methods and evaluation metrics. Finally, we discuss real-world applications and future research directions for image amodal completion, facilitating the reader's understanding of the challenges of existing technologies and upcoming research trends.

completion, dataset, segmentation, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.cviu.2023.103661

2207.02062

Country:

Oceania > Australia (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Overview (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(3 more...)

Add feedback